Course Information
Course Title
數據分析之計算統計學
Computational Statistics for Data Analytics 
Semester
111-1
Offered To
College of Engineering, Department of Civil Engineering
Instructor
汪立本 
Course Number
CIE5140 
Course Identifier
521 U9270 
Credits
3.0 
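Full/Half Year
Half year
Required/Elective
Elective
Class Time
Monday, period 6 (13:20-14:10); Thursday, periods 6-7 (13:20-15:10)
Location
普501 (both sessions)
Remarks
This course is taught in Chinese and uses English textbooks. Prerequisites: Engineering Statistics and Computer Programming. Course materials, assignments, and exam questions are in English.
Open to third-year undergraduates and above.
Enrollment cap: 30 students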
 
Course Syllabus
Course Description

This course extends the Engineering Statistics and Computer Programming courses. Students will work extensively in class with real-world data from engineering, physics, and the environment. Material from those two courses will be briefly reviewed and then reinforced through a series of hands-on projects. The course aims to help students develop solid data-analysis skills and a problem-solving mindset that will be useful whether they go on to work in industry or academia.

Course Objectives
With advances in sensing and computational technologies, the amount of data that modern engineers handle on a daily basis has increased substantially. This course aims to give civil engineering students the essential skills to explore unfamiliar data and to develop the problem-solving and self-learning mindset of a data scientist.
Course Requirements
Computer programming
Engineering statistics 
References
Larry Wasserman, All of Statistics: A Concise Course in Statistical Inference, Springer, USA, 2004.
Allen B. Downey, Think Bayes: Bayesian Statistics Made Simple, O'Reilly, 2012.
Allen B. Downey, Think Stats: Probability and Statistics for Programmers, O'Reilly, 2014.
Allen B. Downey, Think Stats: Exploratory Data Analysis in Python, O'Reilly, 2014.
Annette J. Dobson & Adrian G. Barnett, An Introduction to Generalized Linear Models, 4th Edition, Chapman & Hall/CRC, 2018.
Christian Onof, Lecture Notes for Statistics, Imperial College London, 2017. 
Course Schedule
Week  Date  Topic
Week 1  2022/09/05  Course Intro, Python in a nutshell
Week 1  2022/09/08  Descriptive Stats, Probability and Random variables
Week 2  2022/09/12  Python for basic data processing
Week 2  2022/09/15  Probability distribution
Week 3  2022/09/19  scipy.stats for probability distributions and random variable sampling
Week 3  2022/09/22  Probability distribution fitting, MLE
Week 4  2022/09/26  MLE fitting: handmade vs. scipy.stats
Week 4  2022/09/29  Midterm (I): in-class (1h), Multivariable
Week 5  2022/10/03  Correlated data sampling, Multivariable modelling with Copula
Week 5  2022/10/06  Confidence intervals
Week 6  2022/10/10  National Day (no class); Bootstrapping (pre-recorded)
Week 6  2022/10/13  Statistical test
Week 7  2022/10/17  Goodness-of-fit test
Week 7  2022/10/20  Bayesian inference
Week 8  2022/10/24  PyMC3 for Bayesian inference
Week 8  2022/10/27  Midterm (II): take-home (2022/10/24 to 2022/10/31)
Week 9  2022/10/31  Midterm (II): take-home (2022/10/24 to 2022/10/31)
Week 9  2022/11/03  Working with open/public datasets and scientific data files
Week 10  2022/11/07  NetCDF, HDF5 file processing
Week 10  2022/11/10  Spatial Statistics (I): Variogram
Week 11  2022/11/14  Variogram for spatial association, spatial random field generation
Week 11  2022/11/17  Spatial Statistics (II): Kriging
Week 12  2022/11/21  1D and 2D Kriging interpolation
Week 12  2022/11/24  Bayesian application: Kalman filter
Week 13  2022/11/28  FilterPy for Kalman filter
Week 13  2022/12/01  Classification (I)
Week 14  2022/12/05  Linear, logistic regression
Week 14  2022/12/08  Classification (II)
Week 15  2022/12/12  Naïve Bayes, Support Vector Machine
Week 15  2022/12/15  Stochastic process: Markov chain, Poisson process (tentative)
Week 16  2022/12/19  Bartlett-Lewis model for time series modelling; Final assignment (2022/12/19 to 2022/12/26)
Week 16  2022/12/22  Data ethics (invited talk)